The disassembleBinary function is called with a fixed argument "-fun 1 -c". This might not be correct or optimal in all cases. Validate that the argument works as expected, and consider making it configurable.
The toggleOldCode, toggleNewCode, and toggleDiff functions in inline_javascript.js contain a lot of repeated code. Factoring the common logic into a shared helper would reduce duplication and improve maintainability.
                ["git", "branch", "--quiet", "--color=never", self.full_hash],
                capture_output=True,
            )
            .stdout.strip()
            .decode("utf-8")
            .splitlines()
        ):
            # Possible output:
            #
            #   main
            #   * scalar_seg_edges
            #
            # In this case, we have checked out the HEAD of the
            # scalar_seg_edges branch. Here we just strip the *.
            if line[0] == "*":
                line = line[2:]
            in_branches.append(line)

        def git_show(fmt) -> str:
            return (
                subprocess.run(
                    [
                        "git",
                        "show",
                        "--no-patch",
                        f"--format={fmt}",
                        self.full_hash,
                    ],
                    capture_output=True,
                )
                .stdout.strip()
                .decode("utf-8")
            )

        self.title = git_show("%s")
        self.author_name = git_show("%an")
        self.author_email = git_show("%ae")
        self.author_time = git_show("%ad")
        self.commit_time = git_show("%cd")
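The five `git_show` calls above each start a separate `git` process. As a rough sketch of one alternative, the same fields could be fetched in a single call by joining the format placeholders with `%x00` (git's placeholder for a NUL byte) and splitting the output afterwards. The names `FIELDS`, `build_format`, `parse_show_output`, and `git_show_fields` are hypothetical and not part of this pull request.

```python
import subprocess

# Hypothetical mapping from attribute name to git pretty-format placeholder.
FIELDS = {
    "title": "%s",
    "author_name": "%an",
    "author_email": "%ae",
    "author_time": "%ad",
    "commit_time": "%cd",
}


def build_format(fields=FIELDS) -> str:
    """Join the placeholders with %x00 so git emits NUL-separated values."""
    return "%x00".join(fields.values())


def parse_show_output(raw: str, fields=FIELDS) -> dict[str, str]:
    """Split NUL-separated `git show` output back into named fields."""
    parts = raw.strip().split("\x00")
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
    return dict(zip(fields, parts))


def git_show_fields(full_hash: str) -> dict[str, str]:
    """Run `git show` once and return all requested fields (assumes git is on PATH)."""
    proc = subprocess.run(
        ["git", "show", "--no-patch", f"--format={build_format()}", full_hash],
        capture_output=True,
        check=True,  # raise instead of silently returning empty output
    )
    return parse_show_output(proc.stdout.decode("utf-8"))
```

Whether the saved process launches matter depends on how many commits are inspected; the per-field version in the diff is simpler to read.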
@dataclass_json
@dataclass
class LaunchParams:
    blockDim: tuple[int]
    gridDim: tuple[int]
    dynamic_smem_bytes: int


@dataclass_json
@dataclass
class CompiledKernel:
    filename: str
    cuda: str | None = None
    ptx: str | None = None
    sass: str | None = None
    ptxas_info: str | None = None
    launch_params_str: str | None = None
    launch_params: LaunchParams | None = None
    gmem_bytes: int = 0
    smem_bytes: int = 0
    cmem_bank_bytes: list[int] | None = None
    registers: int | None = None
    stack_frame_bytes: int = 0
    spill_store_bytes: int = 0
    spill_load_bytes: int = 0
    mangled_name: str | None = None
    arch: str | None = None
    index_type: str | None = None

    def __post_init__(self):
        self.parse_ptxas()
        self.parse_launch_params()
    def parse_ptxas(self):
        # Example input:
        #
        #   ptxas info : 307 bytes gmem
        #   ptxas info : Compiling entry function
        #   '_ZN76_GLOBAL__N__00000000_37___tmp_kernel_pointwise_f0_c1_r0_g0_cu_8995cef2_3255329nvfuser_pointwise_f0_c1_r0_g0ENS_6TensorIfLi2ELi2EEES1_S1_'
        #   for 'sm_86'
        #   ptxas info : Function properties for
        #   _ZN76_GLOBAL__N__00000000_37___tmp_kernel_pointwise_f0_c1_r0_g0_cu_8995cef2_3255329nvfuser_pointwise_f0_c1_r0_g0ENS_6TensorIfLi2ELi2EEES1_S1_
        #   ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
        #   ptxas info : Used 203 registers, 16 bytes smem, 472 bytes cmem[0], 8 bytes cmem[2]
        #
        # Here we parse this into the fields presented, and we replace the
        # mangled kernel name since it includes the kernel number and is
        # useless for the purposes of diffing since the kernel signature is
        # already included.
        if self.ptxas_info is None:
            return

        m = re.search(r"Compiling entry function '(.*)' for '(.*)'", self.ptxas_info)
        if m is not None:
            self.mangled_name, self.arch = m.groups()

        def find_unique_int(pattern) -> int | None:
            assert self.ptxas_info is not None
            m = re.search(pattern, self.ptxas_info)
            return 0 if m is None else int(m.groups()[0])

        self.stack_frame_bytes = find_unique_int(r"(\d+) bytes stack frame")
        self.spill_store_bytes = find_unique_int(r"(\d+) bytes spill stores")
        self.spill_load_bytes = find_unique_int(r"(\d+) bytes spill loads")
        self.registers = find_unique_int(r"(\d+) registers")
        self.gmem_bytes = find_unique_int(r"(\d+) bytes gmem")
        self.smem_bytes = find_unique_int(r"(\d+) bytes smem")

        self.cmem_bank_bytes = []
        cmem_banks = 0
        for m in re.finditer(r"(\d+) bytes cmem\[(\d+)\]", self.ptxas_info):
            nbytes_str, bank_str = m.groups()
            bank = int(bank_str)
            if len(self.cmem_bank_bytes) <= bank:
                self.cmem_bank_bytes += [0] * (bank + 1 - len(self.cmem_bank_bytes))
            self.cmem_bank_bytes[bank] = int(nbytes_str)
            cmem_banks += 1

    def parse_launch_params(self):
        # If NVFUSER_DUMP=launch_param is given we will get a line like this for every launch:
        #   Launch Parameters: BlockDim.x = 32, BlockDim.y = 2, BlockDim.z = 2, GridDim.x = 8, GridDim.y = 8, GridDim.z = -1, Smem Size = 49152
        # This is not done by default since we might have hundreds of thousands of these lines.
        # Still, if we recognize it, we will parse this info. If there are
        # multiple lines, we just check that they are all equal and if not then
        # we keep the first version and print a warning.
        if self.launch_params_str is None:
            return

        for line in self.launch_params_str.splitlines():
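The hunk above is cut off at the start of the loop. As a minimal standalone sketch of the behavior the comment describes (parse each recognized line, keep the first result, warn if a later line differs), something like the following could work. It assumes the line format matches the example in the comment exactly; the regex `_LAUNCH_RE` and the free function `parse_launch_params` are illustrative and are not the code in this pull request, and `LaunchParams` is redeclared here only to keep the example self-contained.

```python
import re
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchParams:
    blockDim: tuple[int, int, int]
    gridDim: tuple[int, int, int]
    dynamic_smem_bytes: int


# Assumed line format, copied from the example in the comment above.
_LAUNCH_RE = re.compile(
    r"Launch Parameters: "
    r"BlockDim\.x = (-?\d+), BlockDim\.y = (-?\d+), BlockDim\.z = (-?\d+), "
    r"GridDim\.x = (-?\d+), GridDim\.y = (-?\d+), GridDim\.z = (-?\d+), "
    r"Smem Size = (-?\d+)"
)


def parse_launch_params(text: str) -> Optional[LaunchParams]:
    """Return the first launch parameters found, warning if later ones differ."""
    first = None
    for line in text.splitlines():
        m = _LAUNCH_RE.search(line)
        if m is None:
            continue  # not a launch-parameter line
        bx, by, bz, gx, gy, gz, smem = map(int, m.groups())
        params = LaunchParams((bx, by, bz), (gx, gy, gz), smem)
        if first is None:
            first = params
        elif params != first:
            warnings.warn("Launch parameters differ between launches; keeping the first")
            break  # one warning is enough
    return first
```

Dimensions are matched with `-?\d+` because the example line contains `GridDim.z = -1`.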
The sanitize_sass_lines function uses a fairly complex regex to rewrite mangled kernel names. The regex might not cover all possible cases and could fail for certain kernel names, so it should be tested thoroughly against every expected input form.
    """Remove comments and remove kernel id"""
    sanitary_lines = []
    for l in lines:
        # Replace mangled kernel names like
        #   _ZN76_GLOBAL__N__00000000_37___tmp_kernel_pointwise_f0_c1_r0_g0_cu_8995cef2_3255329nvfuser_pointwise_f0_c1_r0_g0ENS_6TensorIfLi2ELi2EEES1_S1_
        # or
        #   _ZN76_GLOBAL__N__00000000_37___tmp_kernel_4_cu_8995cef2_3255329nvfuser_4ENS_6TensorIfLi2ELi2EEES1_S1_
        # or
        #   _ZN57_GLOBAL__N__00000000_18___tmp_nvfuser_5_cu_badbb5a6_975149nvfuser_5ENS_6TensorINS_6__halfELi3ELi3EEES2_NS_9TensorMapES3_NS0_IS1_Li2ELi2EEE,(.L_x_28 - _ZN11kernelscope6kernelENS_6TensorINS_6__halfELi3ELi3EEES2_NS_9TensorMapES3_NS0_IS1_Li2ELi2EEE)
        # with
        #   _ZN11kernelscope6kernelENS_6TensorIfLi2ELi2EEES1_S1_

        # demangle first two parts after _ZN and replace with "kernelscope" and "kernel"
        m = re.match(r"^(?P<prefix>^.*\b_Z?ZN)(?P<scopenamelen>\d+)_", l)
        if m is not None:
            d = m.groupdict()
            scopenamelen = int(d["scopenamelen"])
            # demangle second part in remainder after scope name
            remainder = l[(len(d["prefix"]) + len(d["scopenamelen"]) + scopenamelen) :]
            mrem = re.match(r"^(?P<varnamelen>\d+)", remainder)
            if mrem is not None:
                drem = mrem.groupdict()
                varnamelen = int(drem["varnamelen"])
                remainder = (
                    "6kernel" + remainder[len(drem["varnamelen"]) + varnamelen :]
                )
                l = d["prefix"] + "11kernelscope" + remainder
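To make the renaming step easier to test in isolation, the logic in the hunk above could be pulled into a small pure function. The sketch below copies the regex and slicing from the diff; the function name `anonymize_kernel_name` is hypothetical. It relies on the mangled-name layout `_ZN<len><scope><len><name>...`, where each length prefix gives the number of characters in the identifier that follows.

```python
import re


def anonymize_kernel_name(line: str) -> str:
    """Replace the scope and kernel identifiers in a mangled name with fixed ones.

    Lines that do not contain a matching mangled name are returned unchanged.
    """
    m = re.match(r"^(?P<prefix>^.*\b_Z?ZN)(?P<scopenamelen>\d+)_", line)
    if m is None:
        return line
    d = m.groupdict()
    scopenamelen = int(d["scopenamelen"])
    # Skip the prefix, the length digits, and the scope name itself.
    start = len(d["prefix"]) + len(d["scopenamelen"]) + scopenamelen
    remainder = line[start:]
    mrem = re.match(r"^(?P<varnamelen>\d+)", remainder)
    if mrem is None:
        return line
    varnamelen_str = mrem.group("varnamelen")
    # Skip the kernel name's length digits and the name, keep the signature.
    signature = remainder[len(varnamelen_str) + int(varnamelen_str):]
    return d["prefix"] + "11kernelscope" + "6kernel" + signature


if __name__ == "__main__":
    mangled = (
        "_ZN76_GLOBAL__N__00000000_37___tmp_kernel_pointwise_f0_c1_r0_g0_cu_"
        "8995cef2_3255329nvfuser_pointwise_f0_c1_r0_g0ENS_6TensorIfLi2ELi2EEES1_S1_"
    )
    print(anonymize_kernel_name(mangled))
```

A function of this shape can be exercised with one assertion per input form listed in the comment, which is one way to address the testing concern raised in the review.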
This is almost working, but there are problems with the JavaScript that still need to be debugged.